Safe Autonomous Reinforcement Learning
Author
Abstract
In the proposed thesis, we focus on equipping existing Reinforcement Learning algorithms with various safety constraints imposed on the exploration scheme. Common Reinforcement Learning algorithms (sometimes implicitly) assume an ergodic¹, or even "restartable", environment. These conditions are, however, not achievable in field robotics, where an expensive robot cannot simply be replaced by a new functioning unit after it performs a "deadly" action. Even so, Reinforcement Learning offers many advantages over supervised learning that are useful in the robotics domain: it may reduce the amount of annotated training data needed to learn a task, or eliminate the need to acquire a model of the whole system. We therefore note the need for techniques that allow Reinforcement Learning to be used safely in non-ergodic and dangerous environments.
Defining and recognizing safe and unsafe states/actions is a difficult task in itself. Even when a safety classifier is available, the safety measures still have to be incorporated into the Reinforcement Learning process so that the efficiency and convergence of the algorithm are not lost. The proposed thesis deals both with the creation of a safety classifier and with the combined use of Reinforcement Learning and safety measures. The available safe exploration methods range from simple algorithms for simple environments to sophisticated methods based on previous experience, state prediction, or machine learning. Unfortunately, the methods suitable for our field robotics case usually require a precise model of the system, which is very difficult (or even impossible) to obtain from sensory input in an unknown environment. In our previous work, we proposed a machine learning approach to the safety classifier that utilizes a cautious simulator.
For the connection of Reinforcement Learning and safety, we further examine a …

¹ An environment with no "black hole" state from which the agent could not escape by performing any action.
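To illustrate one way the safety measures described above could be incorporated into the exploration scheme, the following is a minimal sketch of epsilon-greedy action selection gated by a safety classifier. All names here (`safe_epsilon_greedy`, `is_safe`, the fallback behavior) are illustrative assumptions, not the thesis's actual method.

```python
import random

def safe_epsilon_greedy(q_values, state, actions, is_safe, epsilon=0.1):
    """Pick an action, restricted to those the safety classifier allows.

    q_values: dict mapping (state, action) -> estimated value
    is_safe:  classifier predicting whether (state, action) is safe
    """
    # Restrict exploration to actions the classifier deems safe.
    allowed = [a for a in actions if is_safe(state, a)]
    if not allowed:
        # No safe action is known; a real system would stop or ask
        # an operator rather than act. Here we just fall back.
        allowed = list(actions)
    if random.random() < epsilon:
        return random.choice(allowed)  # explore among safe actions only
    # Exploit: best known safe action.
    return max(allowed, key=lambda a: q_values.get((state, a), 0.0))
```

The design point this sketch makes is that the safety filter is applied before both the exploratory and the greedy choice, so unsafe actions are never sampled even during exploration.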
Similar resources
Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning
Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in Cyber-Physical Systems because high-fidelity physical models of systems are expensive to develop and often intractable to verify. Conversely, rei...
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
Deep reinforcement learning algorithms can learn complex behavioral skills, but real-world application of these methods requires a large amount of experience to be collected by the agent. In practical settings, such as robotics, this involves repeatedly attempting a task, resetting the environment between each attempt. However, not all tasks are easily or automatically reversible. In practice, ...
Compact Q-learning optimized for micro-robots with processing and memory constraints
Scaling down robots to miniature size introduces many new challenges including memory and program size limitations, low processor performance and low power autonomy. In this paper we describe the concept and implementation of learning of a safe-wandering task with the autonomous micro-robots, Alice. We propose a simplified reinforcement learning algorithm based on one-step Q-learning that is op...
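For reference, the one-step Q-learning rule that the snippet above builds on can be sketched as follows; this is the standard tabular update, not the paper's memory-optimized variant, whose details are not given here.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q is a dict mapping (state, action) -> value; missing entries are 0.
    """
    best_next = max(Q.get((s_next, an), 0.0) for an in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
```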
Machine Learning Models to Enhance the Science of Cognitive Autonomy
Intelligent Autonomous Systems (IAS) are highly cognitive, reflective, multitask-able, and effective in knowledge discovery. Examples of IAS include software systems that are capable of automatic reconfiguration, autonomous vehicles, network of sensors with reconfigurable sensory platforms, and an unmanned aerial vehicle (UAV) respecting privacy by deciding to turn off its camera when pointing ...
Safe Exploration Techniques for Reinforcement Learning - An Overview
We overview different approaches to safety in (semi)autonomous robotics. Particularly, we focus on how to achieve safe behavior of a robot if it is requested to perform exploration of unknown states. Presented methods are studied from the viewpoint of reinforcement learning, a partially-supervised machine learning method. To collect training data for this algorithm, the robot is required to fre...